Reducing Bias and Quantifying Uncertainty in Fluorescence Produced by PCR

We present a new approach for relating nucleic-acid content to fluorescence in a real-time Polymerase Chain Reaction (PCR) assay. By coupling a two-type branching process for PCR with a fluorescence analog of Beer’s Law, the approach reduces bias and quantifies uncertainty in fluorescence. As the two-type branching process distinguishes between complementary strands of DNA, it allows for a stoichiometric description of reactions between fluorescent probes and DNA and can capture the initial conditions encountered in assays targeting RNA. Analysis of the expected copy-number identifies additional dynamics that occur at short times (or, equivalently, low cycle numbers), while investigation of the variance reveals the contributions from liquid volume transfer, imperfect amplification, and strand-specific amplification (i.e., if one strand is synthesized more efficiently than its complement). Linking the branching process to fluorescence by the Beer’s Law analog allows for an a priori description of background fluorescence. It also enables uncertainty quantification (UQ) in fluorescence which, in turn, leads to analytical relationships between amplification efficiency (probability) and limit of detection. This work sets the stage for UQ-PCR, where both the input copy-number and its uncertainty are quantified from fluorescence kinetics. Supplementary Information The online version contains supplementary material available at 10.1007/s11538-023-01182-z.


Introduction
Polymerase Chain Reaction (PCR) is a hallmark of molecular biology and applied genetics. When the dynamics of PCR are monitored by a fluorescent probe, the initial amount of target sequence can be quantified (qPCR) by a computational algorithm equipped with a mathematical model and a set of control experiments (Ruijter et al. 2009;Lievens et al. 2012;Zhao and Fernald 2005;Peirson et al. 2003;Tichopad et al. 2003;Boggy and Woolf 2010;Guescini et al. 2008;Ruijter et al. 2013). Quantification by PCR is routinely exploited in many applications, including analysis of forensic evidence (Nicklas and Buel 2003;Bauer 2007), monitoring of food safety (Elizaquível et al. 2014), and clinical diagnostics (Kaltenboeck and Wang 2005;Bustin et al. 2021).
The accuracy and precision of the quantification process is limited by the mathematical model relating DNA content to fluorescence. Current models possess subjective and systematic bias and do not account for the uncertainty in fluorescence that arises from imperfect amplification and pipetting errors.
Systematic bias originates from assuming that the initial DNA type is doublestranded and that the fluorescence increases each time either complementary strand is replicated. The former is obviously not true when the initial DNA is produced by reverse-transcription of single-stranded RNA (RT, as in RT-qPCR), as only one of the complementary DNA strands is present at the beginning of PCR. The second statement is not true for common probes that possess a fluorophore covalently attached to an oligonucleotide. Since the oligonucleotide only hybridizes to one of the complementary strands, the fluorescence only increases when one of the complementary strands is replicated. The second assumption also does not appear to be true for fluorescent dyes that bind non-specifically to DNA, as the amount of dye bound to each DNA strand depends on the amount of DNA in solution.
The impact of several of these assumptions was assessed by Ruijter et al. (2014). The authors found that, depending on the initial DNA type and fluorescent probe chemistry, the background-subtracted fluorescence could differ by up to a factor of 2 in the exponential phase. However, the authors' analysis was rooted in the assumption of perfect amplification. They also noted that what actually occurs during the first few cycles of PCR is unknown.
Another source of systematic bias in the mathematical description of fluorescence arises when the initial DNA content is very small. Current approaches are deterministic and do not take into account the fact that the number of DNA strands is an integer. While the kinetics of PCR have been investigated in the framework of stochastic branching processes (Nedelman et al. 1992;Sun 1995;Weiss and von Haeseler 1995;Stolovitzky and Cecchi 1996;Jacob and Peccoud 1996b, a), the first of which was published in this journal, such models have not been linked to the fluorescence reported by probes. Like the deterministic approaches described above, these stochastic models neither discriminate between complementary strands nor describe initial conditions encountered in RT-qPCR. A mathematical model that discriminates between complementary DNA strands can investigate another source of bias: the assumption that the efficiency of synthesis is independent of directionality (i.e., reverse or forward). Since primers are complementary to different ends of the target sequence, and are specifically chosen not to be complementary to each other (i.e., avoiding dimerization), the formation of one primer-target complex may be more efficient than the other. In addition, the yield of the strand whose replication is being monitored by the fluorescent probe may be affected by the monitoring process. These arguments are also supported by the fact that optimal concentrations of each primer can be different (see, for example, Bustin (2004), where the two concentrations differ by a factor of 3).
To address these challenges, we present a two-type stochastic branching process model in Sect. 2.1 that differentiates between complementary strands and amplification probabilities. Analysis of the expected value in Sect. 2.2 identifies a new timescale that is prevalent during the first few cycles. This timescale explains some of the unknown behavior that occurs during the first few cycles of PCR, explaining some of the aforementioned unknown behavior. At short times, there is a lag in exponential growth where the ratio of expected strand counts changes from its initial to critical value. The critical ratio is related to the amplification probability of each complementary strand, being unity when the probabilities are identical. The analysis also demonstrates that the popular parameter describing PCR efficiency (or amplification probability) is really the geometric mean of the efficiencies of both complementary strands.
Quantification by real-time PCR is also limited by a subjective and empirical description of the fluorescence that is not associated with amplification, or the background fluorescence. The description of background fluorescence is usually taken a posteriori (i.e., after measurements of fluorescence monitoring amplification). Without a clear connection to the chemical and physical processes occurring in solution, the background fluorescence is often assumed to be a linear function of cycle, or a 'baseline.' In Sect. 3.1, we address these concerns by using the fluorescence analog of Beer's Law to relate fluorescence to the concentration of each fluorescent species. While such expressions have often been used to describe the fluorescence of dyes interacting with known amounts of DNA (Biver et al. 2003(Biver et al. , 2005, we are not aware of any adaptation to real-time PCR. We discriminate between the fluorescent species by referring to the form present before PCR as the inactive species and the form activated by PCR as the active species. For hydrolysis probes, we show how the relevant parameters can be extracted from a few control experiments (see Sect. 3.2). In contrast to other approaches, we can quantify the validity of the model of background fluorescence. The relevant contributions can be determined without interrogating or adjusting fluorescence data associated with amplification. We find that the model agrees well with experiment and observe that the incremental increase in fluorescence is not independent of cycle.
A final limitation of real-time PCR is the lack of a mathematical expression relating errors arising from pipetting and imperfect amplification to uncertainty in fluorescence. To quantify the variance in copy number , we investigate the stochastic branching process (Sect. 2.3) in a manner similar to previous reports analyzing error in high-throughput sequencing (Kebschull and Zador 2015;Schwabe and Falcke 2022). After validating that the fluorescence parameter (the fluorescence per mole) for each species is approximately constant for each cycle and well (Sect. 3.2), the models for PCR and fluorescence are combined (Sect. 3.3). This yields analytical expressions of the first two central moments of fluorescence in terms of reaction efficiencies, input content, and input type (i.e., double-stranded DNA, forward-stranded RNA, and reverse-stranded RNA).
Together with the parameters determined from experiment, fluorescence curves computed with uncertainty identify regimes under which certain sources of error are more prevalent than others (see Sect. 3.3). When the expected initial-strand-number is sufficiently large, or the cycle number sufficiently small, the error in fluorescence in a specific well is less than the well-to-well variation in expected value. As the initial strand count decreases and the fluorescence rises above initial levels, however, the variance in input copy-number and imperfect amplification become the dominant contributions to error. Finally, in Sect. 3.4, we use the fluorescence model to develop analytical expressions for the limit of detection as a function of amplification efficiency and nucleic acid type. These expressions may be particularly useful for application in epidemic diseases, as false positives or false negatives may be instead termed inconclusive.

Strand-Specific Branching Process
In this section, we model PCR as a two-type branching-process. The model distinguishes between complementary DNA strands and amplification efficiencies. We then derive analytical expressions relating the first two central-moments of strand counts before PCR to those after each cycle has completed. Compartmentalizing DNA amplification and fluorescence, the linking of the two phenomena is postponed until Sect. 3.

Mathematical Model
PCR consists of a series of n cycles, with n usually ranging between 35 and 50. Each cycle consists of a melting, annealing, and elongation step to synthesize new DNA from existing DNA (i.e., a chain reaction). A variety of resources on PCR are available online for further information (e.g., National Institutes of Health, National Human Genome Research Institute 2023).
To distinguish between the two complementary strands of DNA, we refer to one as the forward strand and the other as the reverse strand. We let the discrete random variables X i and Y i represent the number of forward and reverse strands, respectively, present after i = 0 to n cycles have been completed. We represent the strand counts after completion of i cycles as the random vector and refer to U 0 as the initial strand count. However, we distinguish U 0 from the strand count input to the reaction mixture, denoting the latter by the random vector I X I Y , where I X and I Y are discrete random variables representing the number of forward and reverse strands input, respectively. This is important to distinguish between assays targeting DNA and RNA sequences, as discussed in Sect. 2.1.2.

Relationship Between Consecutive Cycles
After completing i − 1 cycles, the biochemical events occurring during the next cycle involve the attempt to produce one forward strand from each of the Y i−1 reverse strands and the attempt to produce one reverse strand from each of the X i−1 forward strands (see Fig. 1). The outcome of synthesis of a forward strand from each reverse strand is modeled as a Bernoulli random-variable with probability of success p rf ∈ (0, 1). Similarly, the outcome of synthesis of a reverse strand from each forward strand is modeled as a Bernoulli random-variable with probability of success p fr ∈ (0, 1). (The subscripts rf and fr denote the direction reverse-to-forward and forward-to-reverse, respectively.) This corresponds to the mathematical model after completing i cycles, where and B (a; b) denotes a Binomial random-variable of a trials with probability of success b. All Bernoulli trials, whether associated with the outcome of synthesis of a forward or reverse strand, are taken to be independent. To compare to previous approaches that do not discriminate between complementary strands and efficiencies, we will use to denote the total number of strands after i cycles have been completed. We will see that an appropriate characterization of the average amplification efficiency of both complementary strands isp and that an appropriate parameter for the deviation in efficiencies is To avoid changing notation, we will subsequently only investigate p fr =p R and p rf =p R in terms ofp and R, so that

Relationship Between Initial and Input Condition
The relationship between the input number I X I Y and initial number U 0 of strands depends on whether the nucleic acids input to the reaction mixture are forward-stranded RNA (fs-RNA, referred to as Case RF), reverse-stranded RNA (rs-RNA, referred to as Case RR), or double-stranded DNA (ds-DNA, which consists of both fs-DNA and rs-DNA, referred to as Case D). If ds-DNA is generated by transferring rs-DNA and fs-DNA separately into the reaction mixture, X 0 and Y 0 can be modeled as independent and identically distributed (i.i.d.). Here, represents number of fs-DNA (or rs-DNA) strands.
When RNA is input to the reaction mixture, on the other hand, the nucleic acids all possess the same strandedness (i.e., they are all fs-RNA or rs-RNA). The RT step yields DNA strands that are complementary to the RNA. As all RNA strands are fs-RNA (or rs-RNA), the RT step yields rs-DNA (or fs-DNA). Modeling the outcome of synthesis of each cDNA from each RNA strand as a Bernoulli random-variable with probability of success r ∈ (0, 1), U 0 is related to , In comparison to the conventional PCR amplification efficiency p := p fr = p rf when R = 1, which is usually between 0.8 and 0.99, the RT efficiency r can adopt a relatively large range of values (Bustin et al. 2015;Schwaber et al. 2019).
The nucleic acids are input to the reaction mixture by transferring liquids from one container to another using a pipette. Since the process of transferring such liquids is independent of the type of nucleic acids (i.e., independent of whether they are fs-RNA, rs-RNA, fs-DNA, or rs-DNA), the number of strands of each type input to the reaction mixture obey the same distribution. We will let this distribution be obeyed by the discrete random variable I . As a result, it follows that where i.d. denotes identically distributed.

Expected Value
In this section, we derive relationships between E [I ] and expected copy-numbers after i cycles have been completed. After using the total law of expectation to obtain a relationship in expected copies between two sequential cycles, we use induction to generate the desired results. We first investigate the conventional branching process for PCR which results from assuming R = 1 and does not distinguish between fs-DNA and rs-DNA. Subsequently, we consider the two-type branching process (1) where R may not be 1 and fs-DNA is distinguished from rs-DNA.

Conventional Branching Process
The conventional branching process model for PCR occurs when R = 1, implying from (3) that p rf = p fr =: p. In this case, the system (1) can be summed to yield as has been investigated elsewhere (Nedelman et al. 1992;Sun 1995;Weiss and von Haeseler 1995;Peccoud and Jacob 1996;Jacob and Peccoud 1996b, a;Stolovitzky and Cecchi 1996). Using the law of total expectation and (4), one finds that for any two consecutive cycles. From induction, it follows that (5) is identical to as reported elsewhere (Nedelman et al. 1992;Sun 1995;Weiss and von Haeseler 1995;Peccoud and Jacob 1996;Jacob and Peccoud 1996b, a;Stolovitzky and Cecchi 1996).

Strand-Specific Branching Process
Below, we develop relationships between E [I ] and each E [U i ] for the more general case of (1) where forward strands are distinguished from reverse strands and R may not be 1. The law of total expectation and (1) lead to the relation where A is defined as and the last step follows from induction. The relationship between E [U 0 ] in (7) depends on E [I ] through ] and the case-by-case relationships presented in Sect. 2.1.2. We will see that the matrix A plays a central role in dynamics of the first two central-moments of strand counts. A has two distinct eigenvalues, and can be decomposed as where If R = 1, the scale factor 1 √ 2 in the definition of X implies that where · denotes the Euclidean norm. If R = 1, on the other hand, no scale factor can be chosen to satisfy (10).
The quantities X i + RY i and X i − RY i in the second expression of (13) arise frequently in the investigation of the first two central-moments of U i . The quantity X i + RY i is referred to as the weighted sum of strand counts after i cycles have been completed, while X i − RY i is referred to as the weighted difference. Multiplying each side of the equation (13) As λ 1 > 1, the expected weighted-sum exhibits exponential growth. As λ 2 < 1, on the other hand, the expected weighted-difference exhibits exponential decay. The term in (13) associated with λ 2 is always present when the input is RNA (i.e., for Case RF or RR and anyp ∈ (0, 1) and R ∈ (0, ∞)), as E [X 0 ] = RE [Y 0 ]. However, it may not be present if the input is DNA, as the term vanishes ( , the term involving λ i 2 is usually negligible after a few cycles, asp is usually above 0.8. After the lag time is over (i is large enough for λ i 2 to be negligible), the ratio of expected forward to reverse strand counts reaches its critical value, as where e 1 = 1 0 and e 2 = 0 1 are the standard unit vectors. An illustration of the Fig. 2 To compare the two-type branching process to (6), we compute from (13) and, for Case D, as (8) (15) demonstrates that the two approaches are equal if R = 1. Otherwise, the absolute difference between E [N i ] calculated from (6) and (15) increases exponentially with increasing i. In the future, determining R and using (15) instead of (6) may therefore be important to reduce bias in quantification.
On the other hand, when R = 1, the expected amount of each strand is eventually independent of the composition at the start of PCR. That is, only depends on the expected initial-sum, or E [N 0 ]. This result is in contrast to the report of Ruijter et al. (2014), who used a deterministic model with perfect amplification efficiency. When R = 1 and the input is RNA, the directionality of RNA (i.e., fs-RNA or rs-RNA) only dictates the characteristic length of the lag time.

Variance
In this section, we derive the relationship between Var [I ] and the variance after i cycles have been completed. The procedure is similar to the previous section, except the law of total variance is used instead of the law of total expectation.

Conventional Branching Process
Using the law of total variance and (4), one obtains Induction can be used to show that (17) is equivalent to Substitution of (6), simplification of the resultant geometric series, and rearrangement leads to the expression When Var [N 0 ] = 0, Equation (19) becomes identical to what has been reported elsewhere (Sun 1995;Weiss and von Haeseler 1995;Jacob and Peccoud 1996b, a;Stolovitzky and Cecchi 1996) (Nedelman et al. (1992); Peccoud and Jacob (1996) report the leading-order approximation for large i The growth in variance with increasing i by an exponent twice that of the expected value explains why very large cycles, where the expected copy-number is also very large, are not of interest. It also explains why quantification by real-time PCR is more reproducible than end-point PCR.

Strand-Specific Branching Process
The variance-covariance matrix of U i is defined as for each i. From the law of total variance, In a manner similar to (7), it follows that In addition, after substitution of (1), it follows that Combining the two expressions, we obtain the two-type analog of (17), Before using induction to relate Var provided by the first expression of (13). Collecting terms multiplying each eigenvalue, we find that where From induction, it follows that (21) is identical to Substitution of (12) and simplification of the resultant geometric series leads to (see Sect. B.1) where In (24)  =Var where (26b) utilizes the law of total variance. Since X 0 and Y 0 are independent for any case, Cov [X 0 , Y 0 ] = 0. As such, we will often use the substitution without specifying a particular case. In contrast to (19), which only has terms proportional to (1 + p) 2i and (1 + p) i−1 , Equation (24) has terms proportional to λ 2i While physical interpretation of all terms can be complicated, it is useful to examine the leading-order expression as i → ∞, and compute the terms explicitly as where we have used (27). Note that the moments of (X 0 ± RY 0 )/2 arise in (28) as they did in (13). The term ν 1,1 represents the contribution from the variance at the start of PCR. The term η 1,1 accounts for the variance due to imperfect amplification, asp ≈ 1 implies that one-half the second term in brackets of (19). The third term η 1,1 does not have a counterpart in (19). It accounts for differences in strand-specific amplification, as it vanishes when R → 1.
The expressions developed for expected value and variance can be used to produce expressions in the squared coefficients of variation, defined as where denotes element-wise division. Substitution of (13) and (28) into (30) yields As the term O λ −i 1 rapidly approaches zero with increasing i, and λ 1 is usually more than 1.8, the dominant term in (31) is a useful approximation. The dominant term is independent of i and is therefore a very practical tool for estimating the error present in PCR. As such, it is useful to express (31) as for any j, k ∈ {1, 2}, where (27) is used in the first step and the second step follows from simplification with (8) and (26). The quantities α and β, defined as relate the reaction efficiencies to the coefficient of variation, as α = α (R) and β = β (R,p, r ). In Sect. 3.4, we will see that (32) also plays a central role in the limit of detection.

Mathematical Model
To adapt the approach to the fluorescence measured in real-time PCR, it is necessary to relate the DNA content in solution to the monitoring chemistry. When a fluorescent probe is used to monitor the kinetics of PCR, the inactive and active probe species usually make significant contributions to fluorescence. With these two fluorescent species, the fluorescence analog of Beer's Law is Fig. 3 Illustration of relationship between successful DNA replication and changes in fluorescence associated with a hydrolysis probe binding to the reverse strand. (a) As polymerization begins, the hydrolysis probe binds to the reverse strand (blue, dashed line). The probe is in its inactive state, where fluorescence emitted by the fluorophore ('F') is quenched by the quencher ('Q') in close proximity. (b) As polymerization (green arrow) reaches the location of the probe, the probe is hydrolyzed. (c) After successful production of a forward strand (orange, solid line) from a reverse strand, the fluorophore is activated, as it is no longer in close proximity to the quencher. When a reverse strand is produced from a forward strand, however, the fluorescence does not change (Color figure online) after 2 each cycle i = 1 to n and for each well w = 1 to m. Here, F i,w is the fluorescence measured, f − i,w (or f + i,w ) is a constant representing the fluorescence per mole of inactive (or active) probe, and C − i,w (or C + i,w ) is the molar concentration of inactive (or active) probes. Each molar fluorescence, f − i,w or f + i,w (also denoted as f ± i,w ), may depend on i due to photobleaching. Each may also depend on w due to spatial variation in electronics and temperature. The terms F i,w , C − i,w , and C + i,w are random variables through their dependence on DNA content (see below).
Assuming that probe is not degraded during cycling, for all i and w, where C is a known constant representing the total concentration of probe in solution. As DNA is replicated, inactive probe is converted to active probe in a manner that depends on the reaction stochiometry. For hydrolysis probes, an inactive probe is activated, or hydrolyzed, each time one of the complementary strands is replicated. Without loss of generality, we consider the case where the hydrolysis probe binds to the reverse strand (see Fig. 3). After completing i PCR cycles, the concentration of active probe in each well w is then where V and N a are constants representing the volume of solution and Avogadro's number, respectively, and X i,w := X i,w − X 0,w . We assume for each i = 0 to n that X i,1 , . . . , X i,m are independent and distributed identically to X i , and similarly that Y i,1 , . . . , Y i,m are independent and distributed identically to Y i . By combining (34), (35), and (36), the fluorescence model becomes represents the contribution of background fluorescence, and represents the increase in fluorescence per synthesis of forward strand. From (37), the first two central-moments of F i,w are where  (8) and (26) From (13), (28), and (41), it follows that (40) can be expressed as and as i → ∞. As the dominant term of Var F i,w is proportional to λ 2i Instead, the dominant term of Var F i,w arises from Var [X i ] as in (28). As in (32), Equations (42) and (43) imply where α = α (R) and β = β (R,p, r ) were defined in (33) for Case D, RF, and RR.
In contrast to other models (e.g., Ruijter et al. 2013Ruijter et al. , 2014, the fluorescence model (37) is consistent with stoichiometric reactions involving hydrolysis probes and DNA polymerase. By using the fluorescence analog of Beer's Law, it provides a physical basis for description of the background fluorescence b i,w . Finally, unlike conventional approaches, the more mechanistic model demonstrates that d i,w may depend on cycle i.
When R = 1, R complicates the relationship between DNA content and fluorescence. However, for Case RR, Equations (8), (13), and (40a) lead to which is independent of R. Since the choice of forward and reverse strands was arbitrary, this demonstrates that the monitoring probe can be chosen so that E F i,w is independent of R, and may be a useful design-rule for RT-qPCR assays.

Extraction of Molar Fluorescence
In real-time PCR, control experiments containing all reagents except nucleic acid template are often performed to check for contamination. In this section, we will show how they can also be used to calculate f ± i,w . Since template is not present (i.e., U 0 = 0), amplification does not occur and the probe cannot be activated. Equation (37) with (38) becomes where F i,w is instead deterministic. After filling m wells of a PCR plate with inactive probe at known C (and appropriate solvation environment) and measuring F i,w after each i ≥ 1, f − i,w can be calculated pointwise from (45) by division. However, to get a more realistic estimate of each f − i,w , the experiment can be repeated with q different plates having a total probe concentration C 1 < · · · < C q in all m wells. With these experiments yield the measurements for each cycle i and well w. Under the assumption that F j i,w C j for j = 1 to q are i.i.d. to a normal distribution, we can compute f − i,w via Fig. 4 Total number of cycles and wells for all plates (count, vertical axis) possessing values of σ ± i,w / f ± i,w in a certain interval (bin, horizontal axis). The bin widths are obtained from the Freedman-Diaconis rule. All counts for σ − i,w / f − i,w > 0.022 correspond to w = 2 (well A2, see Fig. S2 and Table S2 in the SI) The same procedure can be used to calculate f + i,w after performing the identical experiments with active probe instead of inactive probe. The standard deviation in f ± i,w can be estimated pointwise by Using hydrolysis probes, we selected the fluorophore ('F' in Fig. 3) to represent the active probe. We used q = 4 concentrations for the active probe, and q = 3 concentrations for the inactive probe. Additional details of the experimental procedure can be found in Appendix A. Visual comparisons between the model and experimental data can be found in Figs. S1 to S96 in the Supplementary Information (SI). The pointwise values of f ± i,w and σ ± i,w are tabulated in Tables S1 to S96 of the SI. To assess the validity of approximating the measured fluorescence by f ± i,w C for each cycle i and well w, the coefficient of variation, or , was calculated. A histogram of all values (i.e., all n cycles, all m wells, and all q plates) for each probe species is depicted in Fig. 4. For the active probe, the coefficient of variation is often very small, typically much less than 0.01. For the inactive probe, the coefficient of variation can be larger but is still often less than 0.02. This is an indication that the model is realistic.
Having validated the use of the Beer's Law analog in describing the fluorescence, we used the molar-fluorescence parameters in Fig. 5 to assess how the background b i,w and incremental increase d i,w change with cycle i (for a fixed well w). Here, we see that b i,w is not a linear function of i. In fact, for several w, b i,w possesses a Since the fluorescence of an active probe is larger than the fluorescence of an inactive probe, d i,w is positive. Figure 5 also illustrates that d i,w is not independent of cycle; instead it often increases with i. This reveals another source of systematic error, as most models assume that d i,w is independent of cycle (see, e.g., Ruijter et al. (2009), Equation (4); Lievens et al. (2012), Equation (7); Liu and Saint (2002), Equation (2)).

Calculation of Fluorescence Profiles
Having calculated f − i,w and f + i,w pointwise through (46), we leverage (40) with (8), (13), (24), (26), and (41) to compute fluorescence curves with uncertainty. We prescribe common values for assay parameters C, V,p, and R and assume E [I ] is known. However, we also need to specify the relationship between Var [I ] and E [I ], as well as the distribution of each F i,w .
Our assumption on the distribution of each F i,w is rooted in the characteristic values of the fluorescence parameters b i,w and d i,w . As Fig. 5 demonstrates that b i,w and d i,w are typically around 1 and 10 −6 , respectively, this implies with (40a) that E [ X i ] should be more than 10 6 for E F i,w > b i,w . That is, the expected number of successful Bernoulli trials over all i cycles should be more than 10 6 for the fluorescence to reach levels above background. With such a large sample size, it is natural to invoke the central limit theorem and assume that F i,w obeys a normal distribution with mean E F i,w from (40a) and variance Var F i,w from (40b).
In Fig. 6, fluorescence curves are computed with uncertainty for ds-DNA for w = 13. Different E [I ] ranging from 64 (top-left subplot) down to 4 (bottom-right subplot) are investigated. The expected fluorescence depicts behavior that is characteristic of the background and exponential phase observed in typical measurements. During the initial cycles, the term b i,w is much larger than d i,w E [ X i ] and only small changes in fluorescence are observed. Here, the fluorescence is in the background regime. After more cycles are performed, however, the expected value of fluorescence increases exponentially with cycle. The change in E F i,13 with E [I ] is also in line with typical trends. As E [I ] is decreased, the expected value of the fluorescence in the exponential phase shifts to the right. In other words, more cycles are required to reach the same expected fluorescence value.
This approach provides quantitative estimates of sources and magnitudes of uncertainty in different regimes, which are difficult to determine from replicate experiments alone. For large E [I ] and small i, the well-to-well variation in expected value (lightgrey, shaded regions in Fig. 6) is larger or comparable to the error in fluorescence in each well. As such, spatial variation has a significant impact on the error. After many cycles have been completed, on the other hand, the uncertainty in fluorescence is strongly dependent on the expected initial copy-number, increasing drastically with decreasing E [I ]. At E [I ] = 4, the fluorescence does not reach values that are larger than the background fluorescence by an amount that is statistically significant (for κ = 3). This observation suggests that quantifying the uncertainty in fluorescence can provide limitations on the measurement.

Limit of Detection
After performing n PCR cycles, the fluorescence produced by PCR is only useful if it is larger than background levels by a statistically significant amount. Requiring the increase to be at least some 0 < κ < ∞ standard deviations, this amounts to the constraint for some well w. Equation (49) an expression that is no longer dependent on w. Since CV [ X n ] ≥ 0 and κ > 0, it follows that In addition, since (13), (28), and (41), and n typically ranges from 35 to 50, the error in approximating CV [ X n ] 2 by CV [X n ] 2 is extremely small, often less than machine precision. As a result, the left-hand-side of (50) is expressed as the term on the right-hand-side of (32), or where α and β are defined in (33) for Case D, where the input is ds-DNA; Case RF, where the input is fs-RNA; and Case RR, where the input is rs-RNA. If we let I satisfy (48), Equation (51) can be rearranged to The limit of detection, L, or the smallest expected-initial-copy-number that can be detected reliably, is then The largest coefficient of variation in I that can be detected, M, is estimated from (53) and (48), or To compute typical values of L and M, we evaluated them as in (53) and (54) with χ = 1, κ = 3, for 100 equally-spacedp ∈ [0.8, 0.99], R ∈ [0.9, 1.1], and r ∈ [0.2, 0.99] (including endpoints). For I representing ds-DNA (Case D), we find that L is either 5 or 6, corresponding to M of 0.447 and 0.408, respectively. For I representing RNA as in Case RF or Case RR, L ranges between 9 and 52, corresponding to M of 1/3 and 0.139, respectively. The range of L is the same for fs-RNA and rs-RNA.

Conclusions and Future Work
In this work, we presented a new model for fluorescence in real-time PCR that reduced bias and quantified uncertainty. Distinguishing between complementary strands provided a stoichiometric description of fluorescence reported by hydrolysis probes and permitted application to initial conditions encountered in RT-qPCR. Viewing the fluorescence as a Beer's Law analog enabled the background fluorescence to be determined without extrapolation or assuming a certain relationship with cycle. It also allowed for measurement and calculation of background fluorescence without adjusting amplification data. Incorporating the variance in copy number into the fluorescence model enabled quantification of fluorescence uncertainty and analytical expressions for the limit of detection.
In addition to their practical utility, the two-type branching-process and repurposed fluorescence-model provided new intuition on the physics and chemistry in PCR. At short times, there is a lag in exponential growth (usually at most 5 cycles) as the ratio of expected strand counts changes from its initial to critical value, R. The quantity R represents the square root of the ratio of the two synthesis efficiencies (see (3)). In constrast to a previous report investigating deterministic models, we found that the initial composition only impacts the dynamics after the lag phase if R = 1.
The variance in the fluorescence is dominated by a term that increases exponentially by twice the factor of the expected value. This explains, in part, why quantification by end-point PCR is not reproducible. The three terms dominating the variance were attributed to arise from the initial variance (ν 1,1 , see (29a)), imperfect amplification (η (1) 1,1 , see (29b)), and deviation in directional efficiencies (i.e., R = 1; see η (2) 1,1 in (29c)). The fluorescence model for hydrolysis probes demonstrated that the background fluorescence originates from the molar fluorescence of the inactive probe times the total concentration of probe. The incremental increase in fluorescence is proportional to the difference in molar fluorescence between active and inactive probe and, like the background fluorescence, is neither independent of cycle nor a linear function of cycle.
The stochastic view of PCR explains, in part, why deterministic methods that use reaction-specific amplification probabilities are generally less accurate (Ruijter et al. 2013). This is because, for each well w and cycle i, N i,w /N i−1,w = 1 + p (see Equation (5)). Even if R = 1 and N i,1 , . . . , N i,m are independent and distributed identically to N i for each i, this is not necessarily true because not every realization of a random variable is equal to its expected value.
While this work applied the stochastic model of PCR to the fluorescence reported by hydrolysis probes, it can readily be extended to other chemistries. For example, for probes that anneal to forward-stranded DNA, (37) instead becomes For these probes, the fluorescence is measured during the annealing portion of each cycle where only i − 1 cycles of PCR have been completed.
To capture the fluorescence reported by DNA-binding dyes, on the other hand, an extension to the model is needed. This is because the amount of dye bound to a DNA strand depends on the total amount of DNA present in solution (this includes DNA that is not template, like primers (Ruijter et al. 2009)). Application to fluorescent dyes represents an interesting direction for future generalizations of the fluorescence model.
Finally, our approach in this work focused on quantifying the dynamics and uncertainty of fluorescence when the initial amount of each complementary strand is known, as well as their amplification probabilities. (That is, we assumed that E [I ], Var [I ],p, R, and r were known.) However, the ultimate goal of monitoring the kinetics of PCR by fluorescent probes is to infer E [I ], the expected input number. As such, it is of interest to extend the approach to UQ-PCR, or uncertainty quantification of the initial amount of DNA. To this end, another direction for future work is the investigation of the probabilistic nature of PCR in backwards time (see Fig. 7). Diseases (2020) 2019-nCoV_N1 assay, was obtained from Thermo Fisher Scientific. Working solutions of probe were prepared by adding TE −4 to a portion of the stock solution to yield a solution of concentration 5 μmol/L probe.
In a typical experiment, a solution was prepared with 25 vol. % TE −4 at a specific concentration of either 6-FAM (active probe) or TaqMan probe (inactive probe). After mixing, 20 μL was transferred into each well of a 96-well plate. The plate was covered with adhesive film and centrifuged. After assessing that air bubbles were not visible, the plate was placed in an Applied Biosystems 7500 HID Real-Time PCR instrument. The thermal cycling protocol consisted of a 2 min holding stage at 55 • C (328 K), followed by 45 cycles. Each cycle consisted of 30 s at 55 • C (328 K), followed by 3 s at 95 • C (368 K). Data collection occurred during the 55 • C step in each cycle. The raw data through filter 1 was exported with HID Real-Time PCR Analysis Software, Version 1.2 (Applied Biosystems). The fluorescence values were divided by 10 6 before subsequent analysis.

B.1 Derivation of (24) from (23)
Consider some B ∈ R 2×2 . From (12), we can write where T j,k : R 2×2 → R 2×2 is the linear operator Application of (B1) to (23) leads to For each b ∈ {λ 2 1 , λ 1 λ 2 , λ 2 2 }, the geometric series in (B3) can be simplified via The latter expression follows because U i and U 0 are independent when conditioned on U i−1 . Thus, where the second, third, and fourth step follow from induction, Equation (12), and the independence of X 0 and Y 0 , respectively. In particular, it follows that which is the same as (41).
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.